T-cell acute lymphoblastic leukemia (T-ALL) is a hematological malignancy characterized by dysregulation of transcription factor (TF) oncogenes, with activation of TAL1 being the most common. Our group previously discovered a novel mechanism of aberrant TAL1 activation that occurs through somatic mutation of a non-coding site approximately 7.5 kb from the transcription start site (TSS). The mutations are small 2-18 bp indels that create binding motifs for the TF MYB. In Jurkat T-ALL cells, MYB binds to the novel site and nucleates the formation of a complex consisting of TCF12 (HEB), TCF3 (E2A), RUNX1, GATA3, LMO1 and TAL1 itself, creating a neomorphic enhancer that activates TAL1 in cis (Mansour MR et al. 2014). Although MYB is required to establish the novel enhancer, it remains unclear which members of the complex are required for ongoing enhancer maintenance and oncogenic TAL1 expression, and whether the hierarchy of TFs that constitute the complex includes previously unrecognized proteins.

To address this, we used CRISPR/Cas9 homology-directed repair (HDR) to create a TAL1-reporter Jurkat cell line in which TAL1 is endogenously GFP-tagged on the same allele as the enhancer mutation. Notably, sgRNAs targeting the MYB indel mutation led to marked downregulation of GFP, demonstrating the utility of the cell line. Using the GFP-TAL1 cell line, we performed a genome-wide CRISPR/Cas9 knockout screen targeting 19,114 genes (Doench JG et al. 2016). Cells with reduced GFP expression, indicating reduced TAL1 expression, were sorted for analysis. Genomic DNA was extracted, and sgRNAs were amplified by PCR and sequenced on an Illumina platform. Screen hits were identified using software that quantifies sgRNA enrichment. As expected, POLR2A (log2FC = 4.0, p-value = 8.9 × 10^-5), TAL1 (log2FC = 2.2, p-value = 8.8 × 10^-5) and EP300 (log2FC = 1.5, p-value = 0.002) were amongst the top hits. While sgRNAs targeting MYB were also significantly enriched in the GFP-low fraction, surprisingly, those targeting RUNX1, TCF3, TCF12 and LMO1 were not. Amongst the members of the TAL1 complex, GATA3 scored as the strongest activator of TAL1 (log2FC = 1.8, p-value = 1.2 × 10^-5). We validated GATA3 as a regulator of the TAL1 enhancer using two sgRNAs in independent experiments.
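The enrichment measure reported above (log2FC) can be illustrated with a minimal sketch. The abstract does not name the software used, so the following is not the authors' pipeline: it assumes a simple reads-per-million normalization, a pseudocount of 1, and a median across sgRNAs to summarize each gene. The function names, guide identifiers, and read counts are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics

def log2_fold_change(sorted_counts, control_counts, pseudocount=1.0):
    """Per-sgRNA log2 fold change of the sorted (GFP-low) sample over a control.

    Both inputs map sgRNA id -> raw read count. Counts are scaled to reads
    per million before the ratio is taken (an assumed normalization).
    """
    sorted_total = sum(sorted_counts.values())
    control_total = sum(control_counts.values())
    lfc = {}
    for guide, control_reads in control_counts.items():
        s = sorted_counts.get(guide, 0) / sorted_total * 1e6
        c = control_reads / control_total * 1e6
        lfc[guide] = math.log2((s + pseudocount) / (c + pseudocount))
    return lfc

def gene_scores(guide_lfc, guide_to_gene):
    """Summarize each gene by the median log2FC of its sgRNAs (assumed summary)."""
    per_gene = {}
    for guide, value in guide_lfc.items():
        per_gene.setdefault(guide_to_gene[guide], []).append(value)
    return {gene: statistics.median(values) for gene, values in per_gene.items()}

# Hypothetical toy counts, not data from the screen.
control = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
gfp_low = {"g1": 400, "g2": 400, "g3": 100, "g4": 100}
guide_to_gene = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}

lfc = log2_fold_change(gfp_low, control)
scores = gene_scores(lfc, guide_to_gene)
```

In this toy example the sgRNAs for the hypothetical GENE_A are enriched in the GFP-low sample and receive a positive score, which is the direction expected for an activator of the reporter. Dedicated screen-analysis tools additionally model count variance and report p-values, which this sketch does not attempt.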

Next, we tested the contribution of the GATA3 DNA-binding motifs to TAL1 activation. We found that the TAL1 enhancer sequence contains two pairs of GATA3 binding sites (S1-S2, S3-S4), with each pair oriented 'head-to-tail'. In enhancer-luciferase reporter experiments, point mutations in the S1 and S2 GATA motifs reduced reporter activity only modestly. In contrast, mutation of sites S3 and S4 led to a 3-fold reduction in reporter activity compared to the WT sequence, effectively abrogating enhancer activity. In keeping with this, DNA pull-down assays confirmed strong GATA3 binding to S3 and S4, but not to the S1 and S2 GATA sites. Furthermore, editing of these sites using CRISPR/Cas9 HDR in Jurkat cells confirmed the importance of GATA3 binding at sites S3 and S4 for TAL1 enhancer activation. Using in silico modelling, we propose a novel orientation in which two GATA3 proteins position themselves to cooperatively activate the enhancer.
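As an illustration of how GATA motifs and their relative orientation might be located in a sequence, the sketch below scans both strands for the commonly cited GATA consensus (A/T)GATA(A/G). The abstract does not state which motif definition or tool was used, so the consensus, the function name, and the example sequence are assumptions; the sequence is invented and is not the TAL1 enhancer.

```python
import re

# Consensus WGATAR on the forward strand; its reverse complement is YTATCW.
# Lookaheads allow overlapping matches to be reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")

def find_gata_sites(seq):
    """Return (start, strand, matched_text) for each consensus GATA site.

    start is the 0-based position of the match on the forward strand;
    strand is '+' for WGATAR and '-' for its reverse complement.
    """
    seq = seq.upper()
    sites = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    sites += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(sites)

# Hypothetical sequence for illustration only.
example = "CCAGATAAGGCCTTATCTCC"
sites = find_gata_sites(example)
```

Here `sites` contains one site on each strand, so the two motifs in the invented sequence point in opposite directions. Two sites reported on the same strand would share an orientation, which is how a tandem arrangement would appear in this output.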

Overall, our data provide insights into the mechanisms of enhancer-oncogene regulation in T-ALL. Our results provide evidence that transcription factors follow a complex hierarchy by which enhancers are generated and maintained in disease. In addition, our findings support the view that active TF binding motifs follow an arrangement, or code, that underlies TF cooperativity and may offer opportunities for therapeutic targeting.

No relevant conflicts of interest to declare.
